仅使用单视2D照片的收藏集对3D感知生成对抗网络(GAN)的无监督学习最近取得了很多进展。然而,这些3D gan尚未证明人体,并且现有框架的产生的辐射场不是直接编辑的,从而限制了它们在下游任务中的适用性。我们通过开发一个3D GAN框架来解决这些挑战的解决方案,该框架学会在规范的姿势中生成人体或面部的辐射场,并使用显式变形场将其扭曲成所需的身体姿势或面部表达。使用我们的框架,我们展示了人体的第一个高质量的辐射现场生成结果。此外,我们表明,与未接受明确变形训练的3D GAN相比,在编辑其姿势或面部表情时,我们的变形感知训练程序可显着提高产生的身体或面部的质量。
translated by 谷歌翻译
最近关于机器学习和优化集成的研究的扩散。该研究流中的一个膨胀区域是预测模型嵌入式优化,其使用预先接受训练的预测模型来实现优化问题的目标函数,因此预测模型的特征成为优化问题中的决策变量。尽管该领域最近出版物飙升,但这一决策管道的一个方面已经很大程度上被忽视的是培训相关性,即确保对优化问题的解决方案应该类似于用于训练预测模型的数据。在本文中,我们提出了旨在实施培训相关性的限制,并通过集合来展示添加建议的约束显着提高所获得的溶液质量。
translated by 谷歌翻译
我们研究了通过具有整流线性单元(Relu)激活的前馈神经网络建模目标函数的优化问题。最近的文献已经探讨了单一神经网络的使用来模拟目标函数内的不确定或复杂元素。然而,众所周知,神经网络的集合产生更稳定的预测,并且具有比具有单个神经网络的模型更好的普遍性,这表明在决策管道中应用神经网络的集合。我们研究如何将神经网络集合纳入优化模型的客观函数,并探索随后的问题的计算方法。我们基于现有流行的大量M $配方提供了一种混合整数线性程序,以优化单个神经网络。我们为我们的模型开发了两个加速技术,首先是一种预处理程序,用于拧紧神经网络中的关键神经元的界限,而第二个是基于弯曲分解的一组有效的不等式。我们解决方案方法的实验评估在一个全球优化问题和两个现实世界数据集中进行;结果表明,我们的优化算法在计算时间和最优性间隙方面优于最先进的方法的适应。
translated by 谷歌翻译
Implicitly defined, continuous, differentiable signal representations parameterized by neural networks have emerged as a powerful paradigm, offering many possible benefits over conventional representations. However, current network architectures for such implicit neural representations are incapable of modeling signals with fine detail, and fail to represent a signal's spatial and temporal derivatives, despite the fact that these are essential to many physical signals defined implicitly as the solution to partial differential equations. We propose to leverage periodic activation functions for implicit neural representations and demonstrate that these networks, dubbed sinusoidal representation networks or SIRENs, are ideally suited for representing complex natural signals and their derivatives. We analyze SIREN activation statistics to propose a principled initialization scheme and demonstrate the representation of images, wavefields, video, sound, and their derivatives. Further, we show how SIRENs can be leveraged to solve challenging boundary value problems, such as particular Eikonal equations (yielding signed distance functions), the Poisson equation, and the Helmholtz and wave equations. Lastly, we combine SIRENs with hypernetworks to learn priors over the space of SIREN functions. Please see the project website for a video overview of the proposed method and all applications.
translated by 谷歌翻译
In this paper, we propose a novel technique, namely INVALIDATOR, to automatically assess the correctness of APR-generated patches via semantic and syntactic reasoning. INVALIDATOR reasons about program semantic via program invariants while it also captures program syntax via language semantic learned from large code corpus using the pre-trained language model. Given a buggy program and the developer-patched program, INVALIDATOR infers likely invariants on both programs. Then, INVALIDATOR determines that a APR-generated patch overfits if: (1) it violates correct specifications or (2) maintains errors behaviors of the original buggy program. In case our approach fails to determine an overfitting patch based on invariants, INVALIDATOR utilizes a trained model from labeled patches to assess patch correctness based on program syntax. The benefit of INVALIDATOR is three-fold. First, INVALIDATOR is able to leverage both semantic and syntactic reasoning to enhance its discriminant capability. Second, INVALIDATOR does not require new test cases to be generated but instead only relies on the current test suite and uses invariant inference to generalize the behaviors of a program. Third, INVALIDATOR is fully automated. We have conducted our experiments on a dataset of 885 patches generated on real-world programs in Defects4J. Experiment results show that INVALIDATOR correctly classified 79% overfitting patches, accounting for 23% more overfitting patches being detected by the best baseline. INVALIDATOR also substantially outperforms the best baselines by 14% and 19% in terms of Accuracy and F-Measure, respectively.
translated by 谷歌翻译
The recent increase in public and academic interest in preserving biodiversity has led to the growth of the field of conservation technology. This field involves designing and constructing tools that utilize technology to aid in the conservation of wildlife. In this article, we will use case studies to demonstrate the importance of designing conservation tools with human-wildlife interaction in mind and provide a framework for creating successful tools. These case studies include a range of complexities, from simple cat collars to machine learning and game theory methodologies. Our goal is to introduce and inform current and future researchers in the field of conservation technology and provide references for educating the next generation of conservation technologists. Conservation technology not only has the potential to benefit biodiversity but also has broader impacts on fields such as sustainability and environmental protection. By using innovative technologies to address conservation challenges, we can find more effective and efficient solutions to protect and preserve our planet's resources.
translated by 谷歌翻译
Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
translated by 谷歌翻译
Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.
translated by 谷歌翻译
Coronary Computed Tomography Angiography (CCTA) provides information on the presence, extent, and severity of obstructive coronary artery disease. Large-scale clinical studies analyzing CCTA-derived metrics typically require ground-truth validation in the form of high-fidelity 3D intravascular imaging. However, manual rigid alignment of intravascular images to corresponding CCTA images is both time consuming and user-dependent. Moreover, intravascular modalities suffer from several non-rigid motion-induced distortions arising from distortions in the imaging catheter path. To address these issues, we here present a semi-automatic segmentation-based framework for both rigid and non-rigid matching of intravascular images to CCTA images. We formulate the problem in terms of finding the optimal \emph{virtual catheter path} that samples the CCTA data to recapitulate the coronary artery morphology found in the intravascular image. We validate our co-registration framework on a cohort of $n=40$ patients using bifurcation landmarks as ground truth for longitudinal and rotational registration. Our results indicate that our non-rigid registration significantly outperforms other co-registration approaches for luminal bifurcation alignment in both longitudinal (mean mismatch: 3.3 frames) and rotational directions (mean mismatch: 28.6 degrees). By providing a differentiable framework for automatic multi-modal intravascular data fusion, our developed co-registration modules significantly reduces the manual effort required to conduct large-scale multi-modal clinical studies while also providing a solid foundation for the development of machine learning-based co-registration approaches.
translated by 谷歌翻译
Springs are efficient in storing and returning elastic potential energy but are unable to hold the energy they store in the absence of an external load. Lockable springs use clutches to hold elastic potential energy in the absence of an external load but have not yet been widely adopted in applications, partly because clutches introduce design complexity, reduce energy efficiency, and typically do not afford high-fidelity control over the energy stored by the spring. Here, we present the design of a novel lockable compression spring that uses a small capstan clutch to passively lock a mechanical spring. The capstan clutch can lock up to 1000 N force at any arbitrary deflection, unlock the spring in less than 10 ms with a control force less than 1 % of the maximal spring force, and provide an 80 % energy storage and return efficiency (comparable to a highly efficient electric motor operated at constant nominal speed). By retaining the form factor of a regular spring while providing high-fidelity locking capability even under large spring forces, the proposed design could facilitate the development of energy-efficient spring-based actuators and robots.
translated by 谷歌翻译